MGL takes on the Neyer challenge (January 13, 2004)
Posted 4:02 p.m., January 30, 2004 (#59) - Dr Cane's Cthulhuite Spawn Counterpart
If anyone is reading this thread anymore, I have another question for MGL &c. Does your system deal with adversarial situations?
For example, imagine a team's 25-man roster is split into a Strong squad and a Weak squad, each with 9 position players and 3-4 pitchers. Which players go on each squad is determined by scouting/scrimmage results at the start of the season. When the team plays an opponent with a winning record, it uses its Strong squad, and when it plays an opponent with a losing record, it uses its Weak squad.
Or imagine that every team has Strong and Weak squads. Every day, they choose to play with the same squad as their opposition; on the first day of a series, they play Strong vs Strong, then the next day Weak vs Weak, etc. So all the Strong players never play the Weak players, and vice versa.
In both cases, at the end of the season, the Strong and Weak players might post stats with identical averages and distributions, even though the Strong and Weak players actually have very unequal relative skills/values.
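A toy simulation makes the point concrete (this is entirely my own construction, not anyone's actual rating system - the skill numbers and the hit-probability model are made up for illustration):

```python
import random

random.seed(1)

# Toy model: a batter's hit probability is .250 plus the gap between
# his true skill and the opposing pitching (arbitrary made-up units).
def season_avg(batter_skill, opp_pitching, n_ab=500):
    p = 0.25 + (batter_skill - opp_pitching)
    return sum(random.random() < p for _ in range(n_ab)) / n_ab

strong_skill, weak_skill = 0.05, 0.00  # Strong players really ARE better

# Strong hitters only ever face Strong pitching, and Weak hitters only
# face Weak pitching, so the skill gap cancels out of every stat line.
strong_avgs = [season_avg(strong_skill, strong_skill) for _ in range(50)]
weak_avgs = [season_avg(weak_skill, weak_skill) for _ in range(50)]

print(round(sum(strong_avgs) / 50, 3))  # both pools hit about .250
print(round(sum(weak_avgs) / 50, 3))
```

Both pools come out hitting about .250, so no amount of staring at the stat lines distinguishes them, even though the skill gap is real by construction.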
These are obviously 'devil's advocate' situations, but for many individual players this sort of thing can happen to a noticeable extent - e.g. platoons based on the opposing pitcher, pitchers who only work with a certain catcher (or a groundball pitcher whose manager backs him with better-fielding but weaker-hitting defenders than the rest of the staff normally gets), teams trying to use/avoid using their ace against a particular opponent, pinch hitters brought in mostly in close games (which probably means against a comparable-quality team), etc. I mean, managers specifically try to do these things.
Perhaps what I am trying to say is that the method you use puts a lot of stock in the predictive value of the metrics you have - yet in the 'devil's advocate' cases, the metrics are perfectly predictive but still radically wrong about the relative value of players and teams (because the Strong players are rated as equal to the Weak ones). So how can these methods detect how much devil is in the details? Sure, you can use 'common sense' and look at a manager's strategy and say "obviously, the Strong squads are better", but there is nothing inherent in the method that will notice that.
So how do you know that the method doesn't suffer from these same problems (to a smaller extent) with platoons/aces/etc. that I mentioned above? Or you could imagine that trying to use a single system to rate/value minor and major league players on the same scale would have similar problems. After all, you can't detect the problem by simply looking at predictions, since it will give mostly correct predictions of performance - it will just be consistently wrong about player skill.
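To see why checking predictions won't catch it, here's a sketch using the same kind of made-up toy model as before (hit probability = .250 plus the batter/pitcher skill gap - again my own assumption, not any real projection system):

```python
import random

random.seed(2)

# Toy model: hit probability is .250 plus the gap between the batter's
# true skill and the opposing pitching (arbitrary made-up units).
def sim_avg(batter_skill, pitcher_skill, n_ab=20000):
    p = 0.25 + (batter_skill - pitcher_skill)
    return sum(random.random() < p for _ in range(n_ab)) / n_ab

strong, weak = 0.05, 0.00  # hidden true-skill gap

# Within the usual Strong-vs-Strong / Weak-vs-Weak schedule, the naive
# projection "he'll hit what he hit before" (.250) checks out for both:
print(round(sim_avg(strong, strong), 3))
print(round(sim_avg(weak, weak), 3))

# ...but the first time Strong hitters face Weak pitching, they beat
# that projection by roughly the whole hidden skill gap:
print(round(sim_avg(strong, weak), 3))
```

As long as the schedule keeps the pools separate, every prediction validates - the error only surfaces in the cross-pool matchups that never happen.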